


(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 




(19) World Intellectual Property 
Organization 
International Bureau 




(43) International Publication Date 
29 April 2004 (29.04.2004) 



(10) International Publication Number 

WO 2004/036954 A1



(51) International Patent Classification 7 : 



H04S 7/00 



(21) International Application Number: 

PCT/KR2003/002148 

(22) International Filing Date: 15 October 2003 (15.10.2003) 

(25) Filing Language: Korean 

(26) Publication Language: English 

(30) Priority Data: 

10-2002-0062956 15 October 2002 (15.10.2002) KR 
10-2003-0071344 14 October 2003 (14.10.2003) KR 

(71) Applicant (for all designated States except US):
ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
[KR/KR]; 161, Gajeong-dong, Yuseong-gu, Daejon 305-350 (KR).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): SEO, JEONG
IL [KR/KR]; #107-801 Sejong Apt., Jeonmin-dong,
Yuseong-gu, Daejon 305-728 (KR). JANG, DAE
YOUNG [KR/KR]; #101-1002 Hansol Apt., Songgang-dong,
Yuseong-gu, Daejon 305-503 (KR). KANG,
KYEONG OK [KR/KR]; #101-605, Sarasungpuren
Apt., Jeonmin-dong, Yuseong-gu, Daejon 305-727 (KR).
KIM, JIN WOONG [KR/KR]; #305-1603, Expo Apt.,
Jeonmin-dong, Yuseong-gu, Daejon 305-761 (KR). AHN,
CHIETEUK [KR/KR]; #208-603 Expo Apt., Jeonmin-dong,
Yuseong-gu, Daejon 305-761 (KR).

(74) Agent: SHINSUNG PATENT FIRM; Haecheon Bldg., 
741-40, Yeoksam 1-dong, Kangnam-gu, Seoul 135-924 
(KR). 

(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI, GB, GD, GE,
GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KZ, LC,
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW,
MX, MZ, NI, NO, NZ, OM, PG, PH, PL, PT, RO, RU, SC,
SD, SE, SG, SK, SL, SY, TJ, TM, TN, TR, TT, TZ, UA,
UG, US, UZ, VC, VN, YU, ZA, ZM, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW),
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM),
European patent (AT, BE, BG, CH, CY, CZ, DE, DK, EE,
ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, RO,

[ Continued on next page ] 



(54) Title: APPARATUS AND METHOD FOR ADAPTING AUDIO SIGNAL ACCORDING TO USER'S PREFERENCE 



[Figure: block diagram of the audio adaptation apparatus (see
Fig. 2). Audio data source unit 101 (audio metadata 201, audio
contents 203); audio adaptation unit 103 (audio metadata
adaptation processing unit 213, audio contents adaptation
processing unit 215); audio contents/metadata output unit 105,
which outputs the adapted audio contents/metadata; audio usage
environment information management unit 107 (user
characteristics information management unit 207 and input unit
217; user natural environment information management unit 209
and input unit 219; audio terminal capability information
management unit 211 and input unit 221).]

(57) Abstract: An apparatus and method for adapting an audio signal according to a user's preference. The apparatus and method
provide the user with the best experience of digital contents by adapting audio contents to the user's sound field preference. The apparatus
includes an audio usage environment management unit and an audio adaptation unit for adapting audio contents associated with
the user's adaptation request.






SE, SI, SK, TR), OAPI patent (BF, BJ, CF, CG, CI, CM,
GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).

For two-letter codes and other abbreviations, refer to the
"Guidance Notes on Codes and Abbreviations" appearing at the
beginning of each regular issue of the PCT Gazette.

Published:

— with international search report 




APPARATUS AND METHOD FOR ADAPTING AUDIO SIGNAL ACCORDING TO
USER'S PREFERENCE

Description 

Technical Field

The present invention relates to an audio signal
adaptation apparatus and a method thereof; and, more
particularly, to an apparatus for adapting an audio signal
to a user's preference and a method thereof.

Background Art 

The Moving Picture Experts Group (MPEG) has presented
digital item adaptation (DIA), which is a new standard
working item. A digital item (DI) means a structured
digital object with a standard representation,
identification and metadata, and DIA indicates a process
for generating an adapted DI, which is obtained by
processing a DI in a resource adaptation engine or a
descriptor adaptation engine.

Here, a resource means an item that can be identified
individually, such as video or audio, image or texture and
the like. A descriptor means information related to an
item or a component in the DI. Also, a user includes all of
a producer, a rightful person, a distributor and a consumer.
Media resource stands for a content that can be
expressed digitally immediately. Hereinafter, the word
'content' is used in the same meaning as DI, media resource
and resource.

Conventional technologies have a problem that they
cannot provide a single-source multi-use environment, in
which one single audio content can be adapted to different
usage environments by using information on the usage
environment where the audio content is consumed, such as
user characteristics, natural environment of a user, and
capability of a user terminal.

"Single source" means one single content which is
generated from a multimedia source, while "multi-use" means
that user terminals, each having a different usage
environment, consume the "single source" adaptively to each
usage environment.

An advantage of the single-source multi-use is that
one content can be provided in diverse forms by
re-processing the content adaptively to different usage
environments. Further, the single-source multi-use can
reduce the network bandwidth used, or use it more
effectively, when the single source adapted to the diverse
usage environments is provided to user terminals.

Therefore, a content provider can reduce the unnecessary
cost that is generated when a plurality of contents are
produced and transmitted to match audio signals with the
diverse usage environments. A consumer of content also can
overcome the spatial restriction of his/her environment and
consume an optimal audio content that satisfies the hearing
ability and preference of the content consumer.

However, the prior art does not make the best use of
the advantage of the single-source multi-use
environment even in a universal multimedia access (UMA)
environment.

That is, the multimedia source transmits an audio
content indiscriminately with no consideration for usage
environment, such as user characteristics, natural
environment of a user, and the capability of a user
terminal. Since the user terminal equipped with an audio
player application, such as Windows Media Player, MP3
player, and Real Player, consumes the audio content in the
form in which it is received from the multimedia source, it
is not suitable for a single-source multi-use environment.

To overcome the problems of the prior art and support
the single-source multi-use environment, the multimedia
source provides multimedia contents in consideration of
various usage environments. However, this brings in much
load in the generation and transmission of contents.

Disclosure of Invention 

It is, therefore, an object of the present invention
to provide an audio adaptation apparatus and a method for
adapting an audio content suitably for usage environments
by using information that describes the usage environments
of user terminals.

Those of ordinary skill in the art of the present
invention will easily understand the other objects and
advantages of the present invention from the drawings,
detailed description of the invention, and claims of this
specification.

In accordance with one aspect of the present
invention, there is provided an apparatus for adapting an
audio signal for single-source multi-use, including: an
audio usage environment information management unit for
collecting, describing and managing audio usage environment
information from each user terminal that consumes the audio
signal; and an audio adaptation unit for adapting the audio
signal so that the audio signal is outputted to the user
terminal suitably to the audio usage environment
information, wherein the audio usage environment
information includes user characteristics information that
describes sound field preference of the user for the audio
signal.

In accordance with another aspect of the present
invention, there is provided a method for adapting an audio
signal for single-source multi-use, including the steps of:
a) collecting, describing and managing audio usage
environment information from each user terminal that
consumes the audio signal; and b) adapting the audio signal
so that the audio signal is outputted to the user terminal
suitably to the audio usage environment information,
wherein the audio usage environment information includes
user characteristics information that describes sound field
preference of the user for the audio signal.

Brief Description of Drawings 

The above and other objects and features of the
present invention will become apparent from the following
description of the preferred embodiments given in
conjunction with the accompanying drawings, in which:

Fig. 1 is a block diagram showing an outline of a user
terminal including an audio signal adaptation apparatus in
accordance with an embodiment of the present invention;

Fig. 2 is a block diagram illustrating an audio 
adaptation apparatus in accordance with an embodiment of 
the present invention; 

Fig. 3 is a flowchart describing an audio signal
adaptation process performed in the audio signal adaptation
apparatus of Fig. 1;

Fig. 4 is a flowchart illustrating the audio signal 
adaptation process of Fig. 3; 

Fig. 5 is a diagram showing that sound field
characteristics preferred by a user are embodied through
convolution of an audio content and an impulse response;
and

Fig. 6 is a graph describing the descriptors of
perception parameters.

Best Mode for Carrying Out the Invention 

Other objects and aspects of the invention will become
apparent from the following description of the embodiments
with reference to the accompanying drawings, which is set
forth hereinafter.

The following description exemplifies only the principles
of the present invention. Even if they are not described
or illustrated clearly in the present specification, one of
ordinary skill in the art can embody the principles of the
present invention and invent various apparatuses within the
concept and scope of the present invention.

The use of the conditional terms and embodiments 
presented in the present specification are intended only to 
make the concept of the present invention understood, and 
they are not limited to the embodiments and conditions 
mentioned in the specification. 

In addition, all the detailed description on the 
principles, viewpoints and embodiments and particular 
embodiments of the present invention should be understood 
to include structural and functional equivalents to them. 
The equivalents include not only currently known 
equivalents but also those to be developed in future, that 
is, all devices invented to perform the same function, 
regardless of their structures. 

For example, block diagrams of the present invention
should be understood to show a conceptual viewpoint of an
exemplary circuit that embodies the principles of the
present invention. Similarly, all the flowcharts, state
conversion diagrams, pseudo codes and the like can be
expressed substantially in a computer-readable medium, and
whether or not a computer or a processor is described
distinctively, they should be understood to express various
processes operated by a computer or a processor.

Functions of various devices illustrated in the
drawings including a functional block expressed as a
processor or a similar concept can be provided not only by
using hardware dedicated to the functions, but also by
using hardware capable of running proper software for the
functions. When a function is provided by a processor, the
function may be provided by a single dedicated processor,
a single shared processor, or a plurality of individual
processors, part of which can be shared.
The apparent use of a term such as 'processor', 'control' or
a similar concept should not be understood to exclusively
refer to a piece of hardware capable of running software,
but should be understood to implicitly include a digital
signal processor (DSP), hardware, and ROM, RAM and
non-volatile memory for storing software. Other known
and commonly used hardware may be included therein, too.

In the claims of the present specification, an element
expressed as a means for performing a function described in
the detailed description is intended to include all methods
for performing the function including all formats of
software, such as combinations of circuits for performing
the intended function, firmware/microcode and the like.

To perform the intended function, the element
cooperates with a proper circuit for executing the
software. The present invention defined by claims includes
diverse means for performing particular functions, and the
means are connected with each other in the manner required
by the claims. Therefore, any means that can provide the
function should be understood to be an equivalent to what
is figured out from the present specification.

Other objects and aspects of the invention will become
apparent from the following description of the embodiments
with reference to the accompanying drawings, which is set
forth hereinafter. The same reference numeral is given to
the same element, although the element appears in different
drawings. In addition, if further detailed description on
the related prior arts is determined to blur the point of
the present invention, the description is omitted.
Hereafter, preferred embodiments of the present invention
will be described in detail with reference to the drawings.

Fig. 1 is a block diagram showing an outline of a user
terminal including an audio signal adaptation apparatus in
accordance with an embodiment of the present invention.
The audio adaptation apparatus 100 includes an audio
adaptation unit 103 and an audio usage environment
information management unit 107. Each of the audio
adaptation unit 103 and the audio usage environment
information management unit 107 can be mounted on an audio
processing system independently.

The audio processing system includes a laptop computer,
a notebook computer, a desktop computer, a workstation, a
mainframe computer or other types of computers. It also
includes a data processing system or a signal processing
system, such as a personal digital assistant (PDA) and a
mobile communication station.

The audio processing system may be one of the nodes
that form a network path, e.g., a multimedia source node
system, a multimedia relay node system, and an end user
terminal. The end user terminal is equipped with an audio
player, such as Windows Media Player, MP3 player and Real
Player.

For example, when the audio adaptation apparatus 100
is mounted on the multimedia source node system and
operated, the audio adaptation apparatus 100 receives usage
environment information from the end user terminal, adapts a
content to the usage environment, and transmits the adapted
content to the end user terminal. That is, it adapts the
content suitably to the usage environment by using
information on the usage environment where the audio
content is consumed.

The Technical Committee of the International Organization
for Standardization (ISO)/International Electrotechnical
Commission (IEC) describes the functions and operations of
the elements shown in the preferred embodiment of the
present invention in its Standards Document. Therefore,
the Standards Document may be included as part of the
present invention within the range that it helps
understanding the technology of the present invention.

An audio data source unit 101 receives audio data
generated from the multimedia source. The audio data
source unit 101 can be included in a multimedia source node
system, or a multimedia relay node system or an end user
terminal that receives the audio data transmitted from the
multimedia source node system through a wired/wireless
network.

The audio adaptation unit 103 receives audio data from
the audio data source unit 101. Then, it adapts the
audio data suitably to the usage environment by using the
usage environment information, which is managed by the audio
usage environment information management unit 107 and
includes information on user characteristics, natural
environment of a user, and capability of a user terminal.

Here, the function of the audio adaptation unit 103 is
not necessarily included in any one node system, but it can
be dispersed in another node system that forms a network
path. For example, an audio adaptation unit 103 with a
function of controlling audio volume, which is not related
to a network bandwidth, is included in an end user terminal,
whereas an audio adaptation unit 103 with a function
related to the network bandwidth, for example, a function
of controlling audio level, that is, the intensity of a
particular audio signal in a time domain, can be included
in a multimedia source node system.

The audio usage environment information management
unit 107 collects information from a user, a user terminal
and natural environment of the user, and then describes and
manages usage environment information in advance.

Usage environment information related to a function
performed by the audio adaptation unit 103 can be dispersed
in a node system on the network path, just as the audio
adaptation unit 103.

The audio data output unit 105 outputs audio data
adapted by the audio adaptation unit 103. The outputted
audio data can be transmitted to an audio player of an end
user terminal, or transmitted to a multimedia relay node
system or an end user terminal through a wired/wireless
network.

Fig. 2 is a block diagram illustrating an audio
adaptation apparatus in accordance with an embodiment of
the present invention. Referring to Fig. 2, the audio data
source unit 101 includes audio metadata 201 and audio
contents 203.

The audio data source unit 101 collects and stores
audio contents 203 and audio metadata 201 generated by a
multimedia source. Here, the audio contents 203 can be
stored in various different encoding formats, e.g., MP3,
AC-3, AAC, WMA, RA, CELP and the like, or they include
diverse audio formats transmitted in the form of streaming.

The audio metadata 201 are data related to an audio
content, such as encoding method, sampling rate, the number
of channels (e.g., mono, stereo, and 5.1 channel), and bit
rate. They can be defined and described by eXtensible
Markup Language (XML) schema.

The audio usage environment information management
unit 107 includes: a user characteristics information
management unit 207, a user characteristics information
input unit 217, a user natural environment information
management unit 209, a user natural environment information
input unit 219, an audio terminal capability information
management unit 211, and an audio terminal capability
information input unit 221.

The user characteristics information management unit
207 receives user characteristics information from a user
terminal and manages it. The user characteristics
information includes characteristics of hearing ability,
preferred audio volume, equalizing patterns on a preferred
frequency spectrum and the like. In particular, the user
characteristics information management unit 207 receives
and manages information on a sound field preferred by the
user. The inputted user characteristics information is
managed in a machine-readable language, for example, an
XML-based language.

The user natural environment information management
unit 209 receives information on the natural environment
where the audio content is consumed through the user natural
environment information input unit 219 and manages the
natural environment information. The inputted natural
environment information is managed in a machine-readable
language, for example, an XML-based language.

The user natural environment information input unit
219 transmits noise environment characteristics information
that can be defined by a noise environment classification
table to the user natural environment information
management unit 209. The noise environment classification
table is predetermined or obtained by collecting data at a
particular place and analyzing the data.

The audio terminal capability information management
unit 211 receives audio terminal capability information
through the audio terminal capability information input
unit 221 and manages it. The inputted audio terminal
capability information is managed in a machine-readable
language, for example, an XML-based language.

The audio terminal capability information input unit
221 can transmit audio terminal capability information,
which is predetermined in the user terminal or inputted by
the user, to the audio terminal capability information
management unit 211.

The audio adaptation unit 103 can include an audio
metadata adaptation processing unit 213 and an audio
contents adaptation processing unit 215. The audio
contents adaptation processing unit 215 parses the user
natural environment information which is managed in the
user natural environment information management unit 209
and performs transcoding so that the audio content is
adapted to the natural environment and thus withstands the
noise environment through audio signal processing, such as
noise masking.

Similarly, the audio contents adaptation processing
unit 215 parses the user characteristics information and
the audio terminal capability information that are managed
in the user characteristics information management unit 207
and the audio terminal capability information management
unit 211, respectively, and adapts audio signals so that
the audio content is suitable to the user
characteristics and the audio terminal capability.

The audio metadata adaptation processing unit 213
provides metadata needed for the audio content adaptation
process and adapts the content of audio metadata that
correspond to the result of the audio content adaptation.

Fig. 3 is a flowchart describing an audio signal
adaptation process performed in the audio signal adaptation
apparatus of Fig. 1. Referring to Fig. 3, the process of
the present invention starts with the audio usage
environment information management unit 107.

At step S301, the audio usage environment information
management unit 107 collects usage environment information
of an audio content from the user, the mobile terminal and
the natural environment and describes user characteristics
information, user natural environment information and user
terminal capability information in advance. At step S303,
the audio data source unit 101 receives audio data.

Subsequently, at step S305, the audio adaptation unit
103 adapts the audio signals of the audio content, which
are received at the step S303, suitably to the usage
environment information, e.g., the user characteristics,
the user natural environment and the user terminal
capability by using the usage environment information
described at the step S301. At step S307, the audio data
output unit 105 outputs the audio data adapted at the step
S305.
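The flow of steps S301 to S307 can be sketched as follows. This is only an illustrative outline, not the implementation of the embodiment: the class `UsageEnvironment`, the functions `adapt_audio` and `run_adaptation`, and the dictionary key `audio_power` are hypothetical names, and the only adaptation shown is a simple volume scaling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UsageEnvironment:
    """Hypothetical container for the information described at step S301."""
    user_characteristics: dict = field(default_factory=dict)
    natural_environment: dict = field(default_factory=dict)
    terminal_capability: dict = field(default_factory=dict)


def adapt_audio(samples: List[float], env: UsageEnvironment) -> List[float]:
    """Step S305 (assumed adaptation): scale samples by a preferred volume.

    'audio_power' is an assumed key holding a value between 0 and 1.
    """
    gain = env.user_characteristics.get("audio_power", 1.0)
    return [s * gain for s in samples]


def run_adaptation(samples: List[float], env: UsageEnvironment) -> List[float]:
    """S303: audio data is received; S305: it is adapted; S307: it is output."""
    received = list(samples)              # S303
    adapted = adapt_audio(received, env)  # S305
    return adapted                        # S307


# Usage: the environment is described in advance (S301), then applied.
env = UsageEnvironment(user_characteristics={"audio_power": 0.5})
output = run_adaptation([0.5, -0.5], env)
```

A real adaptation unit would also consult the natural environment and terminal capability fields, which are left unused here.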

Fig. 4 is a flowchart illustrating the audio signal
adaptation process of Fig. 3. Referring to Fig. 4, at step
S401, the audio adaptation unit 103 checks the audio
content and the audio metadata received by the audio data
source unit 101. Then, at step S403, it adapts the audio
content suitably to the user characteristics,
the user natural environment, and the user terminal
capability.

Subsequently, at step S405, the audio adaptation unit
103 adapts the content of the audio metadata for the audio
content based on the result of the audio content adaptation
at the step S403. Hereinafter, an architecture of
description information managed by the audio usage
environment information management unit 107 will be
described.

The information on the user characteristics, the user
terminal capability and the characteristics of the natural
environment should be managed in order to adapt the audio
content suitably to the usage environment, where the audio
content is consumed, by using usage environment information
which is described in advance, such as the user
characteristics, the user natural environment and the user
terminal capability.

Particularly, the user characteristics information
includes "AudioPresentationPreference" descriptors that
describe the audio presentation preference of the user.
The "AudioPresentationPreference" descriptors that have
been discussed in the Moving Picture Experts Group 21
(MPEG-21) are "AudioPower", "Mute", "FrequencyEqualizer",
"Period", "Level", "PresetEqualizer", "AudioFrequencyRange",
and "AudibleLevelRange" descriptors.

The "AudioPower" descriptor shows a user's preference
for loudness of audio. It is described on a normalized
percentage scale from 0 to 1. The "Mute" descriptor shows
the user's preference for the mute part of the audio in a
digital device.

The "FrequencyEqualizer" descriptor shows the user's
preference for the unique concept of equalization using a
frequency domain and a decay value. The "Period"
descriptor is a feature of the "FrequencyEqualizer"
descriptor and it defines the lower corner frequency and
the upper corner frequency of an equalization range that is
expressed in hertz (Hz).

The "Level" descriptor is a feature of the
"FrequencyEqualizer" descriptor and it defines
amplification and decay values of a frequency range that are
expressed in decibels (dB) on a scale from -15 to 15.
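As a small worked example of the "Level" scale, the sketch below converts a value in decibels into a linear factor. It assumes that the decibel value refers to signal amplitude, so that the factor is 10^(dB/20); the specification does not state whether amplitude or power is meant, and the function name `level_to_gain` is hypothetical.

```python
def level_to_gain(level_db: float) -> float:
    """Convert a "Level" value in dB into a linear amplitude factor.

    Assumption: dB refers to amplitude, so gain = 10 ** (dB / 20).
    Values outside the -15 to 15 range given for "Level" are rejected.
    """
    if not -15.0 <= level_db <= 15.0:
        raise ValueError("Level must lie between -15 and 15 dB")
    return 10.0 ** (level_db / 20.0)


# Usage: 0 dB leaves the band unchanged; negative values attenuate it.
unity = level_to_gain(0.0)
boost = level_to_gain(6.0)
cut = level_to_gain(-15.0)
```

Under this assumption the full range of the descriptor corresponds to amplitude factors from roughly 0.18 to roughly 5.6.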

The "PresetEqualizer" descriptor indicates the user's
preference for the unique concept of equalization through a
verbal description of an equalizer preset. The preset
is presented as jazz, rock, classical music and pop music.
The "AudioFrequencyRange" descriptor shows the user's
preference for a particular frequency area. It is
expressed in hertz (Hz) from the lower corner frequency to
the upper corner frequency.

The "AudibleLevelRange" descriptor describes the
user's preference for a particular level range. The
highest value and the lowest value are given as 1 and 0,
respectively.

Meanwhile, the "AudioPresentationPreference"
descriptors cannot describe the user's preference for sound
field sufficiently. Therefore, a descriptor that can
describe user preference information for a sound field is
needed. So, the present invention suggests describing the
preference for sound field at a particular place with an
impulse response and perceptual parameters.

For example, a sound field such as a hall or a church
can be expressed by obtaining the impulse response of a
corresponding place with one or more microphones and
convolving the obtained impulse response with a
corresponding audio content.

Fig. 5 is a diagram showing that sound field
characteristics preferred by a user are embodied through a
convolution of an audio content and an impulse response.
Referring to Fig. 5, the audio adaptation unit 103
convolves the impulse response and the audio content so
that the audio content reflects the sound field
characteristics preferred by the user.
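The convolution of an audio content with an impulse response can be illustrated with a direct-form discrete convolution. This is a minimal sketch for short sequences, assuming both inputs are already sampled at the same rate; the function name `convolve` is hypothetical, and a practical system would normally use a faster FFT-based method.

```python
from typing import List, Sequence


def convolve(signal: Sequence[float],
             impulse_response: Sequence[float]) -> List[float]:
    """Direct-form discrete convolution y[n] = sum_k x[n - k] * h[k]."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            # Each input sample contributes a scaled, delayed copy of h.
            out[n + k] += x * h
    return out


# Usage: a direct sound followed by one reflection at half amplitude.
wet = convolve([1.0, 2.0, 3.0], [1.0, 0.5])
```

The output is longer than the input by the length of the impulse response minus one, which corresponds to the reverberation tail of the simulated place.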

The use of the impulse response makes it possible to
describe the sound field of a consumed content most
precisely, and the perceptual parameters express the
feeling of audio signals perceived by the user, such as
sound source warmth and heaviness of sound.

Following is an architecture of technical information
of usage environment managed by the audio usage environment
information management unit 107 of Fig. 1. It shows an
exemplary syntax expressing a sound field preferred by a
user based on the definition of an XML schema.

<element name="SoundFieldGenerator">
  <sequence>
    <element name="ImpulseResponse" minOccurs="0">
      <complexType>
        <sequence maxOccurs="unbounded">
          <element name="time" type="float"/>
          <element name="amplitude" type="float"/>
        </sequence>
      </complexType>
    </element>
    <element name="PerceptualParameters" minOccurs="0">
      <sequence>
        <element name="SourcePresence" type="float"/>
        <element name="SourceWarmth" type="float"/>
        <element name="SourceBrilliance" type="float"/>
        <element name="RoomPresence" type="float"/>
        <element name="RunningReverberance" type="float"/>
        <element name="Envelopment" type="float"/>
        <element name="LateReverberance" type="float"/>
        <element name="Heavyness" type="float"/>
        <element name="Liveness" type="float"/>
        <element name="RefDistance" type="float"/>
        <element name="FreqLow" type="float"/>
        <element name="FreqHigh" type="float"/>
        <element name="Timelimit1" type="float"/>
        <element name="Timelimit2" type="float"/>
        <element name="Timelimit3" type="float"/>
      </sequence>
    </element>
  </sequence>
</element>

The descriptors of "ImpulseResponse" and the
descriptors of "PerceptualParameters" describe an impulse
response and perceptual parameters, respectively. The
audio adaptation unit 103 adapts the audio data suitably to
the sound field characteristics preferred by the user based
on the descriptors of the "ImpulseResponse" and the
descriptors of the "PerceptualParameters".

As shown in the above XML code, an impulse response
can be expressed with a succession of time values and
amplitude values. On the other hand, it is possible to
replace the impulse response with a Uniform Resource
Identifier (URI) address having impulse response
characteristic information by considering the amount of
data of the "ImpulseResponse".
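To show how such a succession of time and amplitude values might be used, the sketch below reads an instance document shaped after the exemplary schema and places each pair on a uniform sample grid. Several points are assumptions rather than part of the specification: that the time values are in seconds, that the instance carries no XML namespace, and the names `read_impulse_response`, `pairs_to_taps` and `EXAMPLE`.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical instance following the exemplary schema (no namespace assumed).
EXAMPLE = """
<SoundFieldGenerator>
  <ImpulseResponse>
    <time>0.0</time><amplitude>1.0</amplitude>
    <time>0.002</time><amplitude>0.5</amplitude>
  </ImpulseResponse>
</SoundFieldGenerator>
"""


def read_impulse_response(xml_text: str) -> List[Tuple[float, float]]:
    """Return the (time, amplitude) pairs of the ImpulseResponse element."""
    root = ET.fromstring(xml_text)
    ir = root.find("ImpulseResponse")
    if ir is None:  # the element is optional (minOccurs="0")
        return []
    times = [float(e.text) for e in ir.findall("time")]
    amps = [float(e.text) for e in ir.findall("amplitude")]
    if len(times) != len(amps):
        raise ValueError("time and amplitude values must come in pairs")
    return list(zip(times, amps))


def pairs_to_taps(pairs: List[Tuple[float, float]],
                  sample_rate_hz: float) -> List[float]:
    """Place each pair at the nearest sample index (time assumed in seconds)."""
    if not pairs:
        return []
    if any(t < 0 for t, _ in pairs):
        raise ValueError("time values must be non-negative")
    taps = [0.0] * (max(round(t * sample_rate_hz) for t, _ in pairs) + 1)
    for t, a in pairs:
        taps[round(t * sample_rate_hz)] += a
    return taps


pairs = read_impulse_response(EXAMPLE)
taps = pairs_to_taps(pairs, 1000.0)
```

The resulting list can be used as the impulse response in a convolution with the audio content, provided the content is sampled at the same rate.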

Also, the user's preference for a sound field can be
reflected by adding additional descriptors, such as
"SamplingFrequency", "BitsPerSample" and "NumOfChannel"
descriptors, along with the impulse response
characteristics obtained from the URI address. The
perceptual parameters use "PerceptualParameters"
descriptors of MPEG-4 Advanced AudioBIFS to describe a
scene preferred by the user. For more description on each
descriptor, "ISO/IEC 14496-1:1999" can be referred to.

As shown in the above XML code, the
"PerceptualParameters" includes the "SourcePresence",
"SourceWarmth", "SourceBrilliance", "RoomPresence",
"RunningReverberance", "Envelopment", "LateReverberance",
"Heavyness", "Liveness", "RefDistance", "FreqLow",
"FreqHigh", "Timelimit1", "Timelimit2", and "Timelimit3"
descriptors.

Fig. 6 is a graph describing the descriptors of
"PerceptualParameters". The "SourcePresence" descriptor
describes the energy of the direct sound and the early
room effect in decibels (dB). The "SourceWarmth"
descriptor describes the relative early energy at a low
frequency in decibels.

The "SourceBrilliance" descriptor describes the
relative early energy at a high frequency in decibels. The
"RoomPresence" descriptor describes the energy of the later
room effect in decibels.

The "RunningReverberance" descriptor describes the
relative early decay time in milliseconds (ms). The
"Envelopment" descriptor describes the energy of the early
room effect relative to the direct sound in decibels.

The "LateReverberance" descriptor describes the late
decay time in milliseconds (ms). The "Heavyness" descriptor
describes the relative decay time at a low frequency. The
"Liveness" descriptor describes the relative decay time at
a high frequency.

The "RefDistance" descriptor describes a reference
distance, in meters (m), at which the perceptual parameters
are defined. The "FreqLow" descriptor describes the low
frequency limit in hertz (Hz), as shown in Fig. 6. The
"FreqHigh" descriptor describes the high frequency limit
in hertz (Hz), as shown in Fig. 6.

The "Timelimit1" descriptor describes the limit (l1)
of a first moment in milliseconds (ms), as shown in Fig. 6.
The "Timelimit2" descriptor describes the limit (l2) of a
second moment in milliseconds (ms), as shown in Fig. 6.
The "Timelimit3" descriptor describes the limit (l3) of a
third moment in milliseconds (ms), as shown in Fig. 6.

Just as with the impulse response, the audio adaptation
unit 103 reflects the sound field characteristics preferred
by the user in the audio content based on the perceptual
parameters.

In addition to the impulse response characteristics and
the perceptual parameters, an "AuditoriumParameters"
descriptor can be added to obtain three-dimensional sound.

The space where content is consumed can differ from
user to user, even if the users prefer the same sound field
characteristics. As a result, the restored content can end
up with different sound field characteristics. Therefore,
the audio adaptation unit 103 removes the adverse effects
caused by the user's sound environment based on the
"AuditoriumParameters" descriptor.

The following is the structure of the technical usage
environment information managed by the audio usage
environment information management unit 107 of Fig. 1. It
shows an exemplary syntax that expresses the user sound
environment based on the XML schema definition.

<element name="AuditoriumParameters" minOccurs="0">
<sequence>
<element name="ReverberationTime" type="float" minOccurs="0"/>
<element name="InitialDecayTime" type="float" minOccurs="0"/>
<element name="RDRatio" type="float" minOccurs="0"/>
<element name="Clarity" type="float" minOccurs="0"/>
<element name="IACC" type="float" minOccurs="0"/>
</sequence>
</element>

The "AuditoriumParameters" uses the "ReverberationTime",
"InitialDecayTime", "RDRatio", "Clarity", and "IACC"
descriptors to express the sound environment of a space
where the user consumes the audio content.
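Because every child element is declared with minOccurs="0", a consumer of this description has to tolerate missing values. The following Python sketch reads a hypothetical instance fragment into a dictionary, using None for absent descriptors. The instance layout (no namespaces, plain numeric text content) and the function name `parse_auditorium_parameters` are assumptions for illustration only; the source gives the schema fragment but no instance document.

```python
import xml.etree.ElementTree as ET

# Descriptor names taken from the schema fragment above.
FIELDS = ("ReverberationTime", "InitialDecayTime", "RDRatio", "Clarity", "IACC")


def parse_auditorium_parameters(xml_text):
    """Return a dict of descriptor name -> float, or None when omitted."""
    root = ET.fromstring(xml_text)
    if root.tag != "AuditoriumParameters":
        raise ValueError("expected an AuditoriumParameters element")
    values = {}
    for name in FIELDS:
        node = root.find(name)
        # Each descriptor is optional, so a missing child is not an error.
        values[name] = float(node.text) if node is not None else None
    return values
```

A fragment that carries only the reverberation time and the IACC would then yield None for the other three descriptors, which lets the adaptation stage decide which corrections it can apply.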

The "ReverberationTime" descriptor expresses the
reverberation time. It describes the time taken for a
sound level to decay by 60 dB, in milliseconds. The
reverberation time is expressed as RT or T60, and it is the
most basic physical quantity that shows interior sound
characteristics.
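To make the 60 dB decay definition concrete, the sketch below estimates the reverberation time from a sampled room impulse response using Schroeder backward integration: the energy remaining after each sample is accumulated from the tail, and the first instant at which it has fallen 60 dB below the total is reported in milliseconds. This is a common textbook approach chosen for the example, not a method stated in the source, and the function name `reverberation_time_ms` is hypothetical. Measured responses usually have a noise floor, so practical tools tend to fit a line to an upper part of the decay and extrapolate instead.

```python
import math


def reverberation_time_ms(ir, sample_rate, drop_db=60.0):
    """Time in ms for the backward-integrated energy to fall by drop_db.

    Returns None if the response is empty, silent, or never decays that far.
    """
    energy = [x * x for x in ir]
    # Schroeder backward integration: edc[n] is the energy from sample n on.
    edc = [0.0] * len(energy)
    running = 0.0
    for n in range(len(energy) - 1, -1, -1):
        running += energy[n]
        edc[n] = running
    if not edc or edc[0] <= 0.0:
        return None
    for n, e in enumerate(edc):
        # Zero remaining energy counts as having passed the threshold.
        if e <= 0.0 or 10.0 * math.log10(e / edc[0]) <= -drop_db:
            return 1000.0 * n / sample_rate
    return None
```

For a synthetic exponentially decaying response whose amplitude falls by a factor of 1000 in 0.5 s, the function returns a value close to 500 ms.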

The "InitialDecayTime" descriptor expresses the
initial decay time. It describes the time difference
between the direct sound and the reflected sound in
milliseconds. The initial decay time is a physical quantity
that shows the intimacy with a hall. It is also called IDT.

The "RDRatio" descriptor describes the energy ratio
of the direct sound and a reflected sound after 50
milliseconds in percent (%). The "RDRatio" descriptor is
an information quantity that expresses a single sound and
the waveform of the reverberation sound. It is a physical
quantity that indicates the clarity of speech and it is
called D50.

The "Clarity" descriptor describes the energy ratio
of the direct sound and a reflected sound after 80
milliseconds in percent (%). It is a basic physical
quantity that indicates the clarity of music and it is
called C80.
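
The two ratios above differ only in the time boundary (50 ms versus 80 ms). The Python sketch below computes the share of impulse response energy that arrives before a given boundary, expressed in percent. It assumes the conventional room-acoustics reading of D50 (energy in the first 50 ms divided by total energy), because the source does not give a formula, and it applies the same percent form at 80 ms to follow the unit stated in the text, although C80 is more commonly reported as an early-to-late ratio in decibels. The function name `early_energy_percent` is hypothetical.

```python
def early_energy_percent(ir, sample_rate, limit_ms):
    """Energy arriving before limit_ms as a percentage of total energy."""
    split = int(round(limit_ms * sample_rate / 1000.0))
    energy = [x * x for x in ir]
    total = sum(energy)
    if total == 0.0:
        raise ValueError("impulse response has no energy")
    # Samples with index < split are counted as early sound.
    return 100.0 * sum(energy[:split]) / total
```

With a direct sound at 0 ms and one equally strong reflection at 60 ms, the 50 ms boundary captures half of the energy while the 80 ms boundary captures all of it.
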

The "IACC" descriptor describes the maximum value of
the interaural cross-correlation function of the impulse
responses obtained at the left ear and the right ear, taken
over a range from -1 ms to 1 ms. The "IACC" descriptor
takes a value in a range from -1 to 1. The "IACC"
descriptor shows the similarity of the sound that arrives
at each ear of the listener. It is a physical quantity that
indicates the sense of spread of the sound.
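
The IACC definition can be sketched as a normalized cross-correlation searched over lags of at most 1 ms in either direction. The Python example below follows the range of -1 to 1 given in the text by taking the maximum of the signed correlation; many references take the maximum of the absolute value instead, so this choice, the equal-length requirement, and the function name `iacc` are assumptions of the example.

```python
import math


def iacc(left, right, sample_rate, max_lag_ms=1.0):
    """Maximum normalized cross-correlation of two ear responses within +/- max_lag_ms."""
    if len(left) != len(right):
        raise ValueError("left and right responses must have equal length")
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        raise ValueError("both responses must contain energy")
    max_lag = int(round(max_lag_ms * sample_rate / 1000.0))
    n = len(left)
    best = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, acc / norm)
    return best
```

Two identical responses that differ only by a delay shorter than 1 ms give a value of 1, while responses that share little energy within the lag window give a value near 0.
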

The above descriptors represent the characteristics
of the sound environment of the user. In accordance with
the present invention, it is possible to provide a
single-source multi-use environment where one audio content
can be adapted to the characteristics and tastes of various
users in different usage environments by using the sound
field information preferred by the users and the user sound
environment information.

While the present invention has been described with
respect to certain preferred embodiments, it will be
apparent to those skilled in the art that various changes
and modifications may be made without departing from the
scope of the invention as defined in the following claims.

What is claimed is:

1. An apparatus for adapting an audio signal for
single-source multi-use, comprising:

an audio usage environment information management
means for collecting, describing and managing audio usage
environment information from each user terminal that
consumes the audio signal; and

an audio adaptation means for adapting the audio
signal so that the audio signal is outputted to the user
terminal suitably to the audio usage environment
information,

wherein the audio usage environment information
includes user characteristics information that describes
sound field preference of the user for the audio signal.

2. The apparatus as recited in claim 1, wherein the
user characteristics information includes preference for
impulse response, and the audio adaptation means adapts the
audio signal, and transmits the adapted audio signal to the
user terminal by changing the sound field characteristics
of the audio signal based on the preference for the impulse
response.

3. The apparatus as recited in claim 2, wherein the
impulse response is described with time and amplitude.

4. The apparatus as recited in claim 1, wherein the
user characteristics information includes preference for
perceptual parameters of the audio signal, and the audio
adaptation means adapts the audio signal and transmits the
adapted audio signal to the user terminal by changing the
sound field characteristics of the audio signal based on
the preference for the perceptual parameters.

5. The apparatus as recited in claim 1, wherein the
user characteristics information includes sound environment
information of a space where the user consumes the audio
signal, and the audio adaptation means adapts the audio
signal and transmits the adapted audio signal to the user
terminal by removing adverse effects caused by the sound
environment of the user among the sound field
characteristics of the audio signal based on the sound
environment information.

6. The apparatus as recited in claim 5, wherein the
sound environment information includes reverberation time
information of the space.

7. The apparatus as recited in claim 5, wherein the
sound environment information includes initial decay time
of the space.

8. The apparatus as recited in claim 5, wherein the
sound environment information includes energy ratio
information between direct sound of the space and reflected
sound after a predetermined time.

9. The apparatus as recited in claim 5, wherein the
sound environment information is a physical quantity that
indicates the sense of sound spread and the sound
environment information includes similarity information of
sound that arrives at each ear of the user.

10. A method for adapting an audio signal for
single-source multi-use, comprising the steps of:

a) collecting, describing and managing audio usage
environment information from each user terminal that
consumes the audio signal; and

b) adapting the audio signal so that the audio signal
is outputted to the user terminal suitably to the audio
usage environment information,

wherein the audio usage environment information
includes user characteristics information that describes
sound field preference of the user for the audio signal.

11. The method as recited in claim 10, wherein the
user characteristics information includes preference for
impulse response and, at the step b), the audio signal is
adapted and transmitted to the user terminal by changing
the sound field characteristics of the audio signal based
on the preference for the impulse response.

12. The method as recited in claim 11, wherein the
impulse response is described with time and amplitude.

13. The method as recited in claim 10, wherein the
user characteristics information includes preference for
perceptual parameters of the audio signal and, at the step
b), the audio signal is adapted and transmitted to the user
terminal by changing the sound field characteristics of the
audio signal based on the preference for the perceptual
parameters.

14. The method as recited in claim 10, wherein the
user characteristics information includes sound environment
information of a space where the user consumes the audio
signal and, at the step b), the audio signal is adapted and
transmitted to the user terminal by removing adverse
effects caused by the sound environment of the user among
the sound field characteristics of the audio signal based
on the sound environment information.

15. The method as recited in claim 14, wherein the
sound environment information includes reverberation time
information of the space.

16. The method as recited in claim 14, wherein the
sound environment information includes initial decay time
of the space.

17. The method as recited in claim 14, wherein the
sound environment information includes energy ratio
information between direct sound of the space and reflected
sound after a predetermined time.

18. The method as recited in claim 14, wherein the
sound environment information is a physical quantity that
indicates the sense of sound spread, and the sound
environment information includes similarity information of
sound that arrives at each ear of the user.

[FIG. 3: flowchart. Audio usage environment information (S301); audio contents/metadata (S303); adaptation (S305); adapted audio contents/metadata (S307).]


[FIG. 4: flowchart of the adaptation step (S305). Audio contents/metadata confirmation; audio contents adaptation (S403); audio metadata adaptation (S405).]



[FIG. 5: diagram. Labels: Audio Contents (amplitude vs. time); Impulse Response (amplitude vs. time); Audio Adaptation Unit; audio contents to which sound field is applied (amplitude vs. time).]

[FIG. 6: graph of the perceptual parameters. Axes: dB vs. time; frequency axis (freq) divided into low, mid and high bands.]


INTERNATIONAL SEARCH REPORT 



International application No.
PCT/KR2003/002148



A. CLASSIFICATION OF SUBJECT MATTER
IPC7 H04S 7/00

According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 

IPC 7 H04S, H04N, H04R 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 
KR, JP: IPC as above 



Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category* / Citation of document, with indication, where appropriate, of the relevant passages / Relevant to claim No.

KR 1020030022842 A (KOREAN INFORMATION AND COMMUNICATIONS) 17 MARCH 2003
See the whole document

US 20030073411 A1 (William K. Meade) 17 APRIL 2003
See the whole document

US 200220120925A1 (James D. Logan) 29 AUGUST 2002
See the whole document



[ ] Further documents are listed in the continuation of Box C.



See patent family annex. 



* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier application or patent but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family



Date of the actual completion of the international search 
20 JANUARY 2004 (20.01.2004) 


Date of mailing of the international search report 
20 JANUARY 2004 (20.01.2004) 


Name and mailing address of the ISA/KR

Korean Intellectual Property Office
920 Dunsan-dong, Seo-gu, Daejeon 302-701,
Republic of Korea

Facsimile No. 82-42-472-7140


Authorized officer

KIM, Seung Jo
Telephone No. 0421)481-5675



Form PCT/ISA/210 (second sheet) (January 2004)



INTERNATIONAL SEARCH REPORT
Information on patent family members
International application No. PCT/KR2003/002148

Patent document           Publication   Patent family   Publication
cited in search report    date          member(s)       date

KR 1020030022842 A        17-03-03      None
US 20030073411 A1         17-04-03      None
US 200220120925A1         29-08-02      None

Form PCT/ISA/210 (patent family annex) (January 2004)